Filter Examples with Missing Values (Operator Toolbox)

Synopsis

This operator filters examples with missing values.

Description

This operator filters examples based on their missing values. The parameter filter method selects how examples are filtered. The options are: keep all examples; keep only examples whose number of attributes with missing values does not exceed an absolute maximum; keep only examples whose relative number of attributes with missing values does not exceed a relative maximum; or keep all examples that have at least one non-missing attribute value. The absolute and relative thresholds are set with the parameters maximum number of missings and maximum relative number of missings.

By default, special attributes are ignored by the filter: the condition for filtering an example is evaluated only on the regular attributes. If the parameter include special attributes is selected, special attributes are also included in the evaluation of the filter condition.
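The filter methods described above can be illustrated with a minimal Python sketch. This is not the operator's implementation. The function names, the representation of examples as dictionaries, and the treatment of None and NaN as missing are assumptions made for illustration. The sketch also assumes that the relative number of missings is computed over the attributes the condition is evaluated on.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def keeps_example(example, attributes, filter_method,
                  maximum_number_of_missings=0,
                  maximum_relative_number_of_missings=0.0):
    """Return True if the example passes the chosen filter method.

    `attributes` lists the attribute names the condition is evaluated on
    (the regular attributes, unless special attributes are included).
    """
    total = len(attributes)
    missing = sum(is_missing(example[name]) for name in attributes)
    if filter_method == "keep all":
        return True
    if filter_method == "one or more non-missing":
        return missing < total
    if filter_method == "maximum number missing":
        return missing <= maximum_number_of_missings
    if filter_method == "maximum relative number missing":
        # Assumption: the ratio is taken over the evaluated attributes.
        ratio = missing / total if total else 0.0
        return ratio <= maximum_relative_number_of_missings
    raise ValueError(f"unknown filter method: {filter_method}")


# Hypothetical data: four examples with 0, 1, 2 and 3 missing values.
examples = [
    {"a": 1.0, "b": 2.0, "c": 3.0},
    {"a": None, "b": 2.0, "c": 3.0},
    {"a": None, "b": None, "c": 3.0},
    {"a": None, "b": None, "c": None},
]
regular = ["a", "b", "c"]

# Keep examples with at most one attribute with a missing value.
kept = [e for e in examples
        if keeps_example(e, regular, "maximum number missing",
                         maximum_number_of_missings=1)]
```

With this data, the absolute threshold of 1 keeps the first two examples, while the method one or more non-missing would remove only the last example.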

Input

  • example set (Data Table)

    The input ExampleSet to be filtered.

Output

  • filtered example set (Data Table)

    The filtered ExampleSet, containing only the examples that fulfill the filter condition.

  • original (Data Table)

    The original ExampleSet.

Parameters

  • filter_method

    This parameter selects the method used to filter examples with missing values. It has the following options:

    • keep all: This option keeps all examples of the ExampleSet.
    • one or more non-missing: All examples that have at least one non-missing attribute value are kept. Examples with only missing values are removed. By default, special attributes are ignored for this condition. To include special attributes, select the parameter include special attributes.
    • maximum number missing: All examples in which at most a given number of attributes have missing values are kept. The maximum number can be specified by the parameter maximum number of missings. By default, special attributes are ignored for this condition. To include special attributes, select the parameter include special attributes.
    • maximum relative number missing: All examples in which at most a given relative number of attributes have missing values are kept. The maximum relative number can be specified by the parameter maximum relative number of missings. By default, special attributes are ignored for this condition. To include special attributes, select the parameter include special attributes.
    Range:
  • maximum_number_of_missings

    Only examples in which the number of attributes with missing values is less than or equal to this parameter are kept.

    Range:
  • maximum_relative_number_of_missings

    Only examples in which the number of attributes with missing values, relative to the number of all attributes, is less than or equal to this parameter are kept.

    Range:
  • invert_selection

    If selected, the filter is inverted: all examples that fulfill the filter condition are removed, and the others are kept.

    Range:
  • include_special_attributes

    Special attributes are attributes with special roles. These are: id, label, prediction, cluster, weight and batch. Custom roles can also be assigned to attributes. By default, all special attributes are ignored when the filter condition is evaluated. If this parameter is set to true, the filter condition is also evaluated on special attributes.

    Range:
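
How the parameters invert_selection and include_special_attributes interact with a threshold can be sketched in Python as follows. This is an illustrative sketch rather than the operator's implementation. The function filter_examples, the data layout, and the use of None for missing values are hypothetical, and only the maximum number missing method is modeled.

```python
def filter_examples(examples, regular, special, maximum_number_of_missings,
                    invert_selection=False, include_special_attributes=False):
    """Keep examples with at most `maximum_number_of_missings` missing values."""
    # Special attributes only enter the condition when explicitly included.
    evaluated = regular + special if include_special_attributes else regular
    result = []
    for example in examples:
        missing = sum(example[name] is None for name in evaluated)
        fulfills = missing <= maximum_number_of_missings
        # invert_selection flips which side of the condition is kept.
        if fulfills != invert_selection:
            result.append(example)
    return result


# Hypothetical data: "id" and "label" play special roles, "a" and "b" are regular.
examples = [
    {"id": 1, "label": "yes", "a": 1.0, "b": None},
    {"id": 2, "label": None, "a": 2.0, "b": None},
    {"id": 3, "label": None, "a": None, "b": None},
]
regular, special = ["a", "b"], ["id", "label"]

default = filter_examples(examples, regular, special, 1)
inverted = filter_examples(examples, regular, special, 1,
                           invert_selection=True)
with_special = filter_examples(examples, regular, special, 1,
                               include_special_attributes=True)

print([e["id"] for e in default])       # [1, 2]
print([e["id"] for e in inverted])      # [3]
print([e["id"] for e in with_special])  # [1]
```

In the default call, the missing label of example 2 does not count, so the example is kept. Once special attributes are included, the same example has two missing values and is removed.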

Tutorial Processes

Demonstration of different filter methods